Kheirbek et al. Environmental Health 2012, 11:51
http://www.ehjournal.net/content/11/1/51



ENVIRONMENTAL HEALTH 



RESEARCH Open Access 



Spatial variability in levels of benzene, 
formaldehyde, and total benzene, toluene, 
ethylbenzene and xylenes in New York City: 
a land-use regression study 

Iyad Kheirbek, Sarah Johnson, Zev Ross, Grant Pezeshki, Kazuhiko Ito, Holger Eisl and Thomas Matte
Abstract 

Background: Hazardous air pollutant exposures are common in urban areas contributing to increased risk of 
cancer and other adverse health outcomes. While recent analyses indicate that New York City residents experience 
significantly higher cancer risks attributable to hazardous air pollutant exposures than the United States as a whole, 
limited data exist to assess intra-urban variability in air toxics exposures. 

Methods: To assess intra-urban spatial variability in exposures to common hazardous air pollutants, street-level 
air sampling for volatile organic compounds and aldehydes was conducted at 70 sites throughout New York City 
during the spring of 2011. Land-use regression models were developed using a subset of 59 sites and validated
against the remaining 11 sites to describe the relationship of concentrations of benzene, total BTEX
(benzene, toluene, ethylbenzene, xylenes) and formaldehyde to indicators of local sources, adjusting for 
temporal variation. 

Results: Total BTEX levels exhibited the most spatial variability, followed by benzene and formaldehyde (coefficient 
of variation of temporally adjusted measurements of 0.57, 0.35, 0.22, respectively). Total roadway length within 
100 m, traffic signal density within 400 m of monitoring sites, and an indicator of temporal variation explained 
65% of the total variability in benzene while 70% of the total variability in BTEX was accounted for by traffic signal 
density within 450 m, density of permitted solvent-use industries within 500 m, and an indicator of temporal 
variation. Measures of temporal variation, traffic signal density within 400 m, road length within 100 m, and 
interior building area within 100 m (indicator of heating fuel combustion) predicted 83% of the total variability 
of formaldehyde. The models built with the modeling subset were found to predict concentrations well,
explaining 62% to 68% of the variance in monitored values at validation sites.

Conclusions: Traffic and point source emissions cause substantial variation in street-level exposures to common 
toxic volatile organic compounds in New York City. Land-use regression models were successfully developed 
for benzene, formaldehyde, and total BTEX using spatial indicators of on-road vehicle emissions and emissions 
from stationary sources. These estimates will improve the understanding of health effects of individual pollutants 
in complex urban pollutant mixtures and inform local air quality improvement efforts that reduce disparities 
in exposure.

Background

Despite regulatory controls, urban populations are exposed 
to toxic air pollutants with potential to cause cancer or 
other serious health effects. The 1990 Amendments to the
Clean Air Act identified 187 hazardous air pollutants 
(HAPs) subject to emissions-based controls due to health
effects associated with ambient exposures [1]. These regu- 
lations include controls on 174 stationary source categories 
to meet maximum achievable control technology standards 
and mobile source air toxics rules that reduce vehicle emis- 
sions through fuel controls, including lowering limits on 
benzene in gasoline beginning in 2011 [2]. 

HAPs commonly found in urban areas include formal- 
dehyde and a group of aromatic volatile organic com- 
pounds (VOC): benzene, toluene, ethylbenzene, xylene 
(together known as BTEX). Among these, benzene and 
formaldehyde are classified by the International Agency 
for Research on Cancer as human carcinogens (Group 
1); both are key drivers of estimated cancer risk from organic
HAPs in the US [3,4]. Other BTEX compounds
(toluene, ethylbenzene, and xylene) have been found to
produce adverse health effects including respiratory and 
neurological effects [5-7] and react to form secondary 
organic aerosols, contributing to ambient fine particulate 
matter (PM2.5) [8]. BTEX and formaldehyde also play 
important roles in the photochemical reactions that 
form ozone [9]. 

Recent analyses suggest that 49% of New York City 
residents live in census tracts exceeding the 1 in 10,000 
HAP-attributable cancer risk benchmark compared to 
4.8% of the population nationwide, with the majority 
of the risk attributed to benzene and formaldehyde 
exposures [10,11]. Primary local sources of BTEX are 
on-road and non-road gasoline vehicles and engines, 
with emissions from petroleum transport/storage and 
solvent usage also making substantial contributions [12]. 
On- and non-road gasoline and diesel vehicles and 
engines are also predominant sources of primary formal- 
dehyde emissions in NYC with additional contributions 
from stationary-source fuel combustion [12]. Formalde- 
hyde is also formed secondarily by photooxidation of 
hydrocarbons. Ambient formaldehyde levels in New 
York City have been observed to peak in summer 
months, likely due to seasonal increases in photochem- 
ical activity [13]. 

While national air toxics regulations have reduced 
exposures, the limited number of monitoring sites in 
urban areas restricts the ability to assess spatial variation 
in concentrations within cities for developing local con- 
trol policies. For example, in New York City there are 
currently six regulatory monitors reporting VOC mea- 
surements and five reporting aldehydes, with monitors 
operating only every sixth day [14]. While this network
provides valuable information on air toxics trends useful
in evaluating exposure and regulating ozone, it is
not sufficient to understand fine scale intra-urban spatial 
variation in concentrations due to localized sources such 
as traffic [15,16]. 

Recently, land-use regression (LUR) models have been 
increasingly used to estimate intra-urban spatial vari- 
ability of air pollutants and in developing exposure esti- 
mates for epidemiological research [17,18]. They have 
been used in New York City to develop exposure esti- 
mates for fine particulate matter (PM2.5), oxides of nitro- 
gen (NOx), and sulfur dioxide (SO2) (Clougherty et al. 
submitted 2011, [19]). While many LUR studies focus on 
nitrogen dioxide (NO2) and PM2.5, they have also been
used to estimate BTEX concentrations [16,20-23]. 

This paper evaluates spatial variation in benzene, 
total BTEX and formaldehyde concentrations across 
New York City using a saturation sampling campaign 
conducted in the spring of 2011 and land-use regres- 
sion modeling. 

Methods 

Spatial and temporal allocation of sites 

BTEX and formaldehyde monitoring was conducted at a 
subset of the 150 sites routinely monitored for PM2.5, 
elemental carbon, PM2.5 constituents, NOx, SO2 and 
ozone throughout NYC as part of the New York City 
Community Air Survey (NYCCAS) network, an initiative 
within the City's sustainability plan, PlaNYC [24]. The
NYCCAS monitoring network sites were selected to cap- 
ture the range in variation of key local emissions sources 
while providing adequate spatial coverage throughout 
the City. The selection process for these
150 sites is described elsewhere (Matte et al. submitted
2011). In short, 120 sites were selected for monitoring
through stratified random sampling of 7,756 grid cells
(300 m x 300 m each), with oversampling in areas of high traffic
and high building density (indicators of two categories
of important local emissions sources) to account for
skewed distributions of these source proxies within New 
York City. We chose building density rather than popu- 
lation density as an indicator of source activity suitable 
for both residential and commercial areas of the city. 
Thirty additional sites were selected to fill spatial gaps 
and capture areas of interest. 

Of the original 150 sites, we selected 70 sites for air 
toxics monitoring (referred to as "distributed" sites) by 
first retaining 21 sites that were geographically isolated 
from other monitoring locations or had produced high 
residuals in our prior statistical models for NOx, SO2, 
PM2.5, and EC. These sites were included to ensure 
that the monitoring captured a full range of traffic and 
land-use settings. We then randomly selected from the 
remaining available sites. We compared the distributions
of these 70 sites in relation to traffic and building
density to the distribution in the original 150 sites to
confirm that similar coverage of major source density 
was achieved in the subset of sites selected for air toxics 
monitoring (Table 1). Three reference sites were selected 
in parks, away from major sources, in Central Park in 
Manhattan, Queens College in Queens, and La Tourette 
Golf Course in Staten Island (Figure 1). 

We collected samples of BTEX and formaldehyde at 
each of the 70 distributed sites, 14 of which were allo- 
cated at random to each of five two-week sessions, from 
3/22/2011 to 6/1/2011. At the three reference sites, sam- 
ples were collected during all five sessions to assess city- 
wide temporal variation related to meteorology. 
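
The random allocation of distributed sites to sessions can be illustrated with a short sketch. This is not the authors' procedure or code: the function name `allocate_sessions`, the integer site IDs, and the fixed seed are assumptions made only so the example is repeatable.

```python
import random

def allocate_sessions(site_ids, n_sessions=5, seed=0):
    """Randomly split sites into equally sized sampling sessions."""
    sites = list(site_ids)
    if len(sites) % n_sessions != 0:
        raise ValueError("sites must divide evenly across sessions")
    rng = random.Random(seed)   # seeded only to make the example repeatable
    rng.shuffle(sites)
    per_session = len(sites) // n_sessions
    return [sites[i * per_session:(i + 1) * per_session]
            for i in range(n_sessions)]

# 70 hypothetical site IDs split across five two-week sessions.
sessions = allocate_sessions(range(1, 71))
print([len(s) for s in sessions])   # 14 sites per session
```

Each site appears in exactly one session, so every site is sampled once while the reference sites carry the session-to-session comparison.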

Air sampling and analysis 

Formaldehyde and BTEX compounds were measured 
with Radiello radial passive sampling tubes (Fondazione 
Salvatore Maugeri, Padova, Italy). Samplers were placed 
in weather protective shelters and mounted at 10 feet 
onto street-side signal and lamp posts. Formaldehyde 
measurements were taken for 1 week while BTEX measurements
were conducted for 2 weeks to meet the sampler
manufacturer's sample time specifications [25,26].

Passive BTEX samplers contained activated charcoal 
that collects VOCs by adsorption. Sample analysis 
was conducted by Air Toxics Limited (Folsom, CA) by 
extraction with carbon disulfide and analyzed using gas 
chromatography with mass spectrometry (GCMS). GCMS 
identified five BTEX compounds: benzene, toluene, ethylbenzene,
o-xylene, and m/p-xylene, which were summed
to compute the total BTEX concentration. These sam- 
plers have been used in VOC field monitoring campaigns 
[27-29] as well as prior LUR studies [20]. 

Passive aldehyde samplers contained 2,4-dinitrophe- 
nylhydrazine (2,4-DNPH) coated silica which converts 
aldehydes to stable hydrazone derivatives, 2,4-dinitrophe- 
nylhydrazone. Sample analysis was performed by Air 
Toxics Limited (Folsom, CA) by extracting hydrazones 
with acetonitrile and analyzing using reverse phase high- 
pressure liquid chromatography with ultra-violet detection 
at 360 nm (HPLC-UV). Passive sampling by 2,4-DNPH 

Table 1 Distribution of traffic and building density at NYCCAS network sites and Air Toxics sampling sites

Building Density | Traffic Density | Air Toxics (n = 70): Count | Percent | Full NYCCAS (n = 150): Count | Percent
High | High | 16 | 23% | 34 | 23%
Norm | High | 14 | 20% | 35 | 23%
High | Norm | 20 | 29% | 36 | 24%
Norm | Norm | 20 | 29% | 45 | 30%

High is defined as highest quartile of citywide 300 m x 300 m lattice values.




Figure 1 Map of New York City Community Air Survey sites 
monitored for BTEX compounds and formaldehyde. 




derivatization has been evaluated and applied extensively
in ambient formaldehyde monitoring studies [30-32]. 

Quality assurance 

During each sampling session one field blank was placed 
unopened at the La Tourette reference site for the dur- 
ation of the session and analyzed alongside all other 
samplers. At two sites in each session, two sets of 
samplers were deployed side by side to assess differences 
in collocated samplers. Laboratory quality control pro- 
cedures followed guidelines established for passive 
VOC and aldehyde monitoring by the sampler manufac- 
turer using standard EPA and OSHA methodologies 
[33,34]. For each pollutant, descriptive statistics were 
computed by session to identify potential outliers for 
further investigation. 

Data analysis 
Descriptive analysis 

We computed descriptive statistics across all distributed 
and reference site measurements and compared concen- 
trations to those reported during the same time period 
at rooftop regulatory monitors [14]. Raw measurements 
were then adjusted for temporal variation by dividing 
the distributed site measurements by the mean reference 
value in each session then multiplying this ratio by the 
mean of reference sites across the entire period. We 
described spatial variability by computing the coefficient 
of variation (CV) of temporally adjusted measurements 
across all sessions. We examined spatial distributions 
within each session by computing the CV (based on 
unadjusted values) within each session and examining 



Kheirbek et at. Environmental Health 2012, 11:51 
http://www.ehjournal.net/content/1 1/1/51 



Page 4 of 1 2 



plots of monitored concentrations, session means, and 
reference site means. To assess temporal variation, we 
regressed raw distributed site concentrations on session-specific
means of reference sites, and used the R-squared
(R²) as the indicator of temporal variation (referred to as
"temporal R²" in Results section).

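The temporal adjustment, coefficient of variation, and temporal R² described above can be sketched as follows. The numbers are made up, the function names are hypothetical, and the use of the sample standard deviation in the CV is an assumption, since the text does not say which estimator was used.

```python
from statistics import mean, stdev

def temporally_adjust(raw, sessions, ref_by_session):
    """Divide each raw value by its session's reference mean, then
    multiply by the reference mean over the whole campaign."""
    session_ref = {s: mean(v) for s, v in ref_by_session.items()}
    overall_ref = mean(x for v in ref_by_session.values() for x in v)
    return [c / session_ref[s] * overall_ref for c, s in zip(raw, sessions)]

def coefficient_of_variation(values):
    """Sample standard deviation divided by the mean (assumed estimator)."""
    return stdev(values) / mean(values)

def r_squared(x, y):
    """R-squared of a simple linear regression of y on x."""
    mx, my = mean(x), mean(y)
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy ** 2 / (sxx * syy)

# Made-up values: two sessions, three reference sites, four distributed sites.
ref_by_session = {1: [1.0, 1.2, 1.1], 2: [2.0, 2.2, 2.4]}
raw = [1.1, 2.2, 2.2, 4.4]
sessions = [1, 1, 2, 2]

adjusted = temporally_adjust(raw, sessions, ref_by_session)
cv = coefficient_of_variation(adjusted)
temporal_r2 = r_squared([mean(ref_by_session[s]) for s in sessions], raw)
print([round(a, 2) for a in adjusted], round(cv, 3), round(temporal_r2, 3))
```

In this toy case the second session's reference mean is twice the first, so the adjustment halves the session 2 values relative to session 1 before the spatial CV is computed.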
Geographic variables 

Spatial data on emission source indicators were collected 
and analyzed using ArcGIS 9.2 (ESRI, Redlands CA). 
These datasets were obtained from a variety of public 
and private sources and encompassed a range of data 
types and resolution from highly resolved road network 
line data to traffic volume modeled along "links" be- 
tween destinations. Source indicator categories included 
total and road-specific measures of traffic, mobile source 
diesel combustion, population metrics, built space area, 
land-use type, and emissions permits from point sources, 
transportation facilities, and waste treatment and trans- 
fer facilities (Table 2). City-issued permits on point 
sources were filtered by searching the business descrip- 
tion field using keywords derived from the EPA National 
Emissions Inventory [12] of processes known to produce 
the air toxics of interest. For each indicator, covariates 
were calculated within 15 buffers surrounding each 
monitoring location, at distances of 50 to 1000 meters. 
Detailed descriptions of the GIS datasets used to develop 
source indicators for NYCCAS analyses are available in 
Additional file 1: Table S1.
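
As an illustration of how buffer-based covariates of this kind can be built, the sketch below generates the 15 buffer radii (using the spacing given in the Table 2 footnote) and counts point features within each buffer. The study used ArcGIS; the plain Euclidean distance on projected coordinates in meters, the function names, and the coordinates are assumptions for illustration only.

```python
import math

def buffer_radii():
    """15 radii in meters: 50 m steps from 50 to 500, then 100 m steps to 1000."""
    return list(range(50, 501, 50)) + list(range(600, 1001, 100))

def count_within(site_xy, feature_xys, radius_m):
    """Count point features (e.g. traffic signals) within radius_m of a site."""
    sx, sy = site_xy
    return sum(1 for fx, fy in feature_xys
               if math.hypot(fx - sx, fy - sy) <= radius_m)

# Hypothetical site at the origin and four hypothetical signal locations (m).
site = (0.0, 0.0)
signals = [(30.0, 40.0), (300.0, 0.0), (0.0, 450.0), (800.0, 600.0)]

counts = {r: count_within(site, signals, r) for r in buffer_radii()}
print(len(counts), counts[400], counts[1000])
```

Line and area covariates (road length, built space) would need geometric intersection with each buffer rather than a simple point count.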

LUR model building process 

Prior to modeling, concentrations among the three refer- 
ence sites across the five sampling sessions were exam- 
ined for similarity in temporal patterns. For benzene, 
while two reference sites were highly correlated (Pearson's
correlation (r) = 0.84), one site showed low correlation
with the others (r = 0.13 and -0.18), potentially
indicating local source influence on temporal variation at
that specific site. This site's benzene measurements were
removed to avoid distortion or bias in temporal adjustment.
Raw concentrations were then used as the
dependent variable in the model building process and
each session's mean pollutant concentration at the reference
sites was added as a covariate [35] to adjust for city-wide
temporal variation due to meteorology while explicitly
accounting for error in estimating the temporal term.

Source indicator variables were grouped into six emis- 
sion indicator-based categories: total traffic density, truck 
and bus traffic, permitted combustion-related emissions 
from point sources, built space density, population density,
and non-combustion permitted emissions (solvent use,
petroleum/chemical bulk storage). For each pollutant,
we used a Pearson correlation matrix to select the
two buffer-specific variables within each category most
correlated with temporally adjusted pollutant concentra- 
tions. Each of these two variables was paired with a 
second category-specific term that optimized the R² in
a two-variable model against the pollutant concentra- 
tion. This resulted in a total of four candidate covari- 
ates per category that were considered in subsequent 
model building. 
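
The first screening step (choosing the two buffer-specific variables in a category that are most correlated with concentrations) might look like the sketch below. Ranking by the absolute value of Pearson's r is an assumption, and the variable names and values are invented for illustration.

```python
import math

def pearson_r(x, y):
    """Pearson correlation coefficient between two equal-length sequences."""
    n = len(x)
    mx, my = sum(x) / n, sum(y) / n
    sxy = sum((a - mx) * (b - my) for a, b in zip(x, y))
    sxx = sum((a - mx) ** 2 for a in x)
    syy = sum((b - my) ** 2 for b in y)
    return sxy / math.sqrt(sxx * syy)

def top_two_by_correlation(candidates, concentrations):
    """Return the names of the two candidates with the largest |r|."""
    ranked = sorted(
        candidates,
        key=lambda name: abs(pearson_r(candidates[name], concentrations)),
        reverse=True,
    )
    return ranked[:2]

# Invented temporally adjusted concentrations at five sites and three
# hypothetical buffer-specific variables from one category.
conc = [1.0, 2.0, 3.0, 4.0, 5.0]
category = {
    "signals_100m": [1, 3, 2, 5, 4],
    "signals_400m": [2, 4, 5, 8, 10],
    "signals_1000m": [5, 1, 4, 2, 3],
}
selected = top_two_by_correlation(category, conc)
print(selected)
```

The second step, pairing each selected variable with the category term that maximizes R² in a two-variable model, would repeat a small regression over the remaining candidates and is omitted here.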

We followed a manual forward step-wise model-building
process using reference site concentrations,
emissions source covariates, and site characteristics. 
Models were first fit using a randomly selected "model- 
ing subset" of 85% (n = 59) of distributed sites and the 
resulting provisional models were validated by com- 
paring predicted values with measured values at the 
remaining 15% (n = 11) of sites. Model diagnostics,
including studentized residuals and Cook's distance
values, were inspected for outliers and highly influential 
points and models were evaluated for coherence with 
known emission source patterns and for sensitivity to 
alternative emission source indicators. Once the pro- 
visional models were validated, raw measurements from 
all 70 sites were used to produce final model parameters 
describing the spatial and temporal variability in pollutant
concentrations and for predictions of seasonal mean



Table 2 Summary of GIS-based source indicators

Source Category | Variables | Data Sources
Traffic Indicators | Un-weighted and kernel-weighted road and traffic density, number of signaled intersections, distance to and characteristics of nearest roadway | New York Metropolitan Transportation Council, Highway Performance Monitoring System, Accident Location Information System, Market Planning Solutions TrafficMetrix data, NYC Department of Transportation Truck Routes
Population Metrics | Census population density, LandScan population density | 2000 US Census, Oak Ridge National Laboratory LandScan™
Built Space | Density of built space by land use category | NYC Department of City Planning Primary Land Use Tax Lot Output (PLUTO™)
Permitted Emissions | Permitted combustion sources, solvent use industries (excluding dry cleaning), petroleum bulk storage locations | NYS Department of Environmental Conservation, NYC Department of Environmental Protection
Transportation and waste transfer facilities | School bus depots, waste transfer stations, wastewater treatment facilities, marine terminals, airports | NYC Department of Citywide Administrative Services, NYC Department of Education, NYC Department of Sanitation, NYC Office of Emergency Management

Calculated within buffers at 50 m increments from 50 to 500 meters and 100 m increments from 500 to 1000 meters.

values. After building the final model we computed an
additional purely spatial model that regressed the 
temporally-adjusted pollutant concentrations onto the 
final set of spatial source terms to confirm that both 
temporal adjustment strategies produced comparable 
results. The overall fit of this model is reported in the 
Results section as the amount of spatial variability 
explained by the model. 
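
A minimal sketch of the fit-and-validate workflow is given below, assuming ordinary least squares with the session reference mean entered as a covariate alongside two spatial terms. All data are synthetic, the coefficients used to generate them are arbitrary, and computing validation R² as the squared correlation between observed and predicted values is an assumption; the manual step-wise selection and diagnostic checks are not reproduced.

```python
import random

def solve(a, b):
    """Solve the linear system a x = b by Gaussian elimination."""
    n = len(b)
    m = [row[:] + [b[i]] for i, row in enumerate(a)]
    for col in range(n):
        pivot = max(range(col, n), key=lambda r: abs(m[r][col]))
        m[col], m[pivot] = m[pivot], m[col]
        for r in range(col + 1, n):
            f = m[r][col] / m[col][col]
            for c in range(col, n + 1):
                m[r][c] -= f * m[col][c]
    x = [0.0] * n
    for i in reversed(range(n)):
        tail = sum(m[i][j] * x[j] for j in range(i + 1, n))
        x[i] = (m[i][n] - tail) / m[i][i]
    return x

def fit_ols(rows, y):
    """Ordinary least squares; returns [intercept, slope_1, slope_2, ...]."""
    X = [[1.0] + list(r) for r in rows]
    k = len(X[0])
    xtx = [[sum(x[i] * x[j] for x in X) for j in range(k)] for i in range(k)]
    xty = [sum(x[i] * yi for x, yi in zip(X, y)) for i in range(k)]
    return solve(xtx, xty)

def predict(beta, rows):
    return [beta[0] + sum(b * v for b, v in zip(beta[1:], r)) for r in rows]

def r_squared(observed, predicted):
    """Squared Pearson correlation between observed and predicted values."""
    n = len(observed)
    mo, mp = sum(observed) / n, sum(predicted) / n
    sop = sum((o - mo) * (p - mp) for o, p in zip(observed, predicted))
    soo = sum((o - mo) ** 2 for o in observed)
    spp = sum((p - mp) ** 2 for p in predicted)
    return sop ** 2 / (soo * spp)

# Synthetic stand-in data for 70 sites: signals within 400 m, road length
# (km) within 100 m, and the session's reference-site mean.
rng = random.Random(1)
session_ref = [0.50, 0.55, 0.52, 0.58, 0.51]   # made-up reference means
rows, conc = [], []
for site in range(70):
    signals = rng.randrange(0, 41)
    road_km = rng.uniform(0.0, 0.6)
    ref = session_ref[site % 5]
    rows.append([signals, road_km, ref])
    conc.append(0.02 * signals + 0.5 * road_km + 1.0 * ref + rng.gauss(0, 0.05))

# Random 59/11 split into modeling and validation subsets.
order = list(range(70))
rng.shuffle(order)
train, valid = order[:59], order[59:]

beta = fit_ols([rows[i] for i in train], [conc[i] for i in train])
pred = predict(beta, [rows[i] for i in valid])
validation_r2 = r_squared([conc[i] for i in valid], pred)
print(len(train), len(valid), round(validation_r2, 2))
```

Because the raw concentration is the dependent variable and the reference mean is a covariate, the temporal term is estimated jointly with the spatial slopes rather than fixed in advance.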

Results 

Descriptive statistics 

Across 10 weeks of monitoring, 70 sites were sampled 
successfully for formaldehyde while 69 of 70 scheduled 
sites were sampled successfully for BTEX compounds 
due to a field error where a sampler was not deployed to 
one site scheduled for monitoring. Measurements in all 
samples exceeded the limit of quantification (LOQ) for 
BTEX compounds and formaldehyde. Field blank con- 
centrations were below the LOQ for all BTEX com- 
pounds and all but one formaldehyde sample. Collocated 
samples (n = 10) showed good agreement with mean 
absolute percent differences of 10.9%, 8.0%, and 4.6% 
and R² of 0.80, 0.94, and 0.98 for benzene, BTEX, and
formaldehyde, respectively. One formaldehyde result was 
removed from the analysis because of implausibly high 
concentrations. This yielded 69 total benzene, BTEX 
and formaldehyde samples from distributed sites used in 
further analyses. 
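
For readers unfamiliar with the collocation metric, one common way to compute a mean absolute percent difference is sketched below. The text does not give the formula, so dividing each absolute difference by the pair mean is an assumption, and the paired values are invented.

```python
def mean_abs_percent_difference(pairs):
    """Mean of |a - b| / mean(a, b) * 100 over collocated sampler pairs."""
    return sum(abs(a - b) / ((a + b) / 2) * 100 for a, b in pairs) / len(pairs)

# Invented collocated pairs (ug/m3) from side-by-side samplers.
pairs = [(0.80, 0.88), (1.10, 1.00), (0.60, 0.63)]
mapd = mean_abs_percent_difference(pairs)
print(round(mapd, 1))
```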

Street-side concentrations of all pollutants were higher 
on average than reference site concentrations while aver- 
age benzene and BTEX levels at distributed sites showed 
higher concentrations and wider ranges than those 
reported at regulatory monitoring sites during the same 
period (Table 3). Average formaldehyde levels from dis- 
tributed sites were slightly lower than average regulatory 
site measurements due to one regulatory monitor 
reporting high concentrations for several days during 
the campaign. 

Spatial variability, estimated by the CV across all tem- 
porally adjusted measurements, was greatest for BTEX, 
followed by benzene, then formaldehyde (CV of 0.57, 
0.35, 0.22, respectively). Benzene and BTEX concentra- 
tions showed little temporal variation; 8% and 3% of vari- 
ance, respectively, was explained by session (Figure 2). 
Formaldehyde showed the most city-wide temporal
variability (temporal R² = 46%), with levels generally
increasing as the season progressed and temperatures 
increased (Figure 2). Temporally adjusted concentrations 
were spatially correlated across all three pollutants, with
slightly stronger correlation between benzene and either
total BTEX or formaldehyde (r = 0.73) than between
formaldehyde and BTEX (r = 0.69).

Modeling results 
Benzene 

Predicted concentrations from the provisional model 
explained 62% of the variance in concentrations at the 
validation sites. Spatial and temporal variability of ben- 
zene was associated with, in order of importance based 
on partial R², traffic signal density within 400 m of the
monitors, length of interstate, state, and county high- 
ways within 100 m, and the reference site mean. The 
bivariate relationships between the spatial model terms 
and temporally adjusted concentrations demonstrated 
consistent positive associations across all 69 monitoring 
sites (Figure 3). Including all 69 sites in the final model 
showed that after controlling for other model terms, 
an inter-quartile range (IQR) increase in traffic signal 
density (an indicator of vehicle traffic and congestion) 
was associated with an increase in benzene concentration
of 0.32 µg/m³ while an IQR increase in road length
was associated with an average increase in benzene of
0.15 µg/m³. These terms describe 60% of the spatial variability
(not shown) of benzene across all monitoring sites
and, together with the reference site means, 65% of 
the temporal and spatial variation in benzene (Table 4, 
Figure 4). 

BTEX 

Two sites showed high studentized residuals (>8) and 
high Cook's distance values (>0.6), potentially indicating
unusual emissions patterns near the site. These sites, 
located in the industrial areas of the South Bronx, were 
not outliers for benzene and formaldehyde, but showed 
very high levels of toluene, ethylbenzene, and the 
xylenes. To avoid distortion of the final, city-wide model, 
we elected to remove these sites from the final model. 
Predicted concentrations from the provisional model 
explained 65% of the variance in concentrations at the 
validation sites. The bivariate relationships between
these spatial model terms and temporally adjusted concentrations
confirmed that consistent positive associations
were observed across all 67 sites (Figure 3). Spatial
and temporal variability of BTEX compounds was associated
with, in order of importance based on partial R²,
traffic signal density within 450 m of the monitors,
kernel-weighted density of solvent-use industries within
500 m, and reference site mean. The final model that
included all 67 sites showed an IQR increase in traffic
signal density was associated with an increase in BTEX
concentration of 1.62 µg/m³ while an IQR increase in
density of permitted solvent-use industries was associated
with an increase in BTEX concentration of
0.52 µg/m³. These terms described 64% of the spatial
variability (not shown) in BTEX across all monitoring
sites and, in combination with the reference site means,
explained 70% of the spatial and temporal variation in
BTEX (Table 4, Figure 4).

Table 3 Summary statistics for pollutant concentrations at NYCCAS sites and rooftop regulatory monitoring sites from 3/22/2011-6/1/2011

Pollutant | Distributed Sites: n | Mean (µg/m³) | Range (µg/m³) | Reference Sites: n | Mean (µg/m³) | Range (µg/m³) | Regulatory Sites: n | Mean (µg/m³) | Range (µg/m³)
Benzene | 69 | 0.82 | 0.34-2.3 | 3 | 0.52 | 0.50-0.58 | 6 | 0.65 | 0.50-0.76
BTEX | 69 | 4.66 | 1.52-20.4 | 3 | 2.35 | 2.05-2.72 | 6 | 3.58 | 2.58-4.97
Formaldehyde | 69 | 2.21 | 1.20-3.70 | 3 | 1.83 | 1.62-2.04 | 5 | 2.33 | 1.16-4.31

Figure 2 Distribution of two-week average benzene and BTEX and one-week average formaldehyde concentrations with average
session temperatures measured at monitoring sites.



Formaldehyde 

Predicted concentrations from the provisional model 
explained 68% of the variance in concentrations at the 
validation sites. Spatial and temporal variability of for- 
maldehyde was associated with, in order of importance 
based on partial R², reference site mean, traffic signal
density within 400 m of the monitors, length of roads 
within 100 m, and interior built space within 100 m. 
The bivariate relationships between these spatial model 
terms and temporally adjusted concentrations demon- 
strated consistent positive associations across all 69 
monitoring sites (Figure 3). The final model that 
included all 69 sites showed an IQR increase in signal 
density was associated, on average, with an increase of 
0.36 µg/m³ formaldehyde, an IQR increase in interior
built space density (index of amount of fuel combustion
for heating) was associated with an increase of 0.08 µg/m³,
and an IQR increase in road density was associated
with an increase of 0.19 µg/m³. These terms described 69% of
the spatial variation (not shown) in formaldehyde across
all monitoring sites, and in combination with the reference
site means, they described 83% of the spatial and
temporal variation (Table 4, Figure 4).

Figure 3 Scatterplots of GIS covariates and temporally adjusted concentrations.

Table 4 Land-use regression model results for benzene, BTEX, and formaldehyde. Final model terms listed in order of
importance based on partial R²

[...] | 1.209 | 0.119 | <.0001 | 0.28
Number of signals within 400 meters | 0.020 | 0.004 | <.0001 | 0.07
Road length within 100 meters (km) | 0.561 | 0.112 | <.0001 | 0.07
Built space within 100 meters (km²) | 2.477 | 0.716 | 0.001 | 0.03
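
The IQR-scaled effects quoted above follow from multiplying a regression coefficient by the inter-quartile range of its covariate. The sketch below assumes that the first numeric value in each surviving Table 4 row is the regression coefficient (the column headers are not available here, so this is an assumption), and uses invented signal counts chosen so that their IQR is 18.

```python
from statistics import quantiles

def iqr(values):
    """Inter-quartile range using inclusive quartiles."""
    q1, _, q3 = quantiles(values, n=4, method="inclusive")
    return q3 - q1

def iqr_effect(beta, values):
    """Change in predicted concentration for an IQR increase in a covariate."""
    return beta * iqr(values)

# Invented signal counts within 400 m at nine hypothetical sites.
signal_counts = [1, 4, 10, 13, 19, 25, 28, 31, 40]
effect = iqr_effect(0.020, signal_counts)   # 0.020 assumed to be the slope
print(round(effect, 2))

# Back-calculating the IQR implied by each reported effect and assumed slope.
implied_iqr = {
    "signals_400m": 0.36 / 0.020,
    "road_km_100m": 0.19 / 0.561,
    "built_km2_100m": 0.08 / 2.477,
}
print({k: round(v, 3) for k, v in implied_iqr.items()})
```

Under that assumption, the reported formaldehyde effect of 0.36 µg/m³ corresponds to an IQR of about 18 signals within 400 m.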

Discussion 

This study demonstrates significant intra-urban spatial 
variability in ambient levels of benzene, total BTEX, and 
formaldehyde across New York City monitoring sites, 
with the widest range in concentrations found in total 
BTEX. Within the season, we observed limited temporal 
variability for benzene and BTEX while formaldehyde 
levels increased with increasing average temperatures. 
Land-use regression models explained 65%, 70%, and 
83% of the total variability of benzene, BTEX, and for- 
maldehyde, respectively with temporal terms and spatial 
variables representing traffic density, solvent-use indus- 
tries and built space. The provisional models built with 
the modeling subset were found to predict concentrations
well, explaining 62% to 68% of the variance in monitored
values at validation sites.

Average benzene and BTEX levels were higher than 
those measured at rooftop regulatory monitors during 
the study period, reflecting closer proximity of NYCCAS 
monitoring sites to traffic sources. Prior NYC-based 
monitoring studies of air toxics showed higher ambient 
levels of benzene and BTEX at residential sites mainly in 
the Bronx and Northern Manhattan than levels reported 
here [13,36]. This is likely explained by overall decreases 
in concentrations in NYC and nationwide over the past
decade as well as relatively higher levels of traffic-related
pollutants in Northern Manhattan and the Bronx com- 
pared to the city overall [14,37]. Associations of benzene 
and BTEX concentrations with high traffic density are 
consistent with prior monitoring studies [23,38,39]. 

We found that variables specific to traffic congestion 
and volume best explained the spatial variability of ben- 
zene, with traffic volume indicated through total road 
lengths around monitoring sites and indicators of traffic 
density and congestion represented by traffic signal 
density. These variables were consistent with known 
sources of benzene in NYC, where gasoline vehicles are, 
collectively, the predominant source [12]. Prior LUR 
models for benzene have shown similar results, although 
some included additional terms related to petroleum 
usage, proximity to point sources, and population dens- 
ity [16,21-23]. The association of benzene concentra- 
tions with traffic within 400 meters of monitoring 
locations is consistent with observations that increased 
benzene levels near roadways decay to background 
within around 300 meters [40]. In contrast to many 
prior LUR studies, we chose to address temporal vari- 
ation by using raw unadjusted concentrations as the 
dependent variable and the reference site mean as a cov- 
ariate with the spatial covariates in the model. The ad- 
vantage of this approach over a model in which 
temporally adjusted values are regressed onto spatial 
covariates is that, in estimating the slope for emission 
source terms, it adjusts for city-wide temporal variation 
due to meteorology while explicitly accounting for error 
in estimating the temporal term.

The correlates of spatial variability in total BTEX we
observed in New York City are also consistent with known 
local emission sources including traffic and solvent usage 
[12] and with prior studies linking higher BTEX concen- 
trations to traffic as well as distance to VOC emitting 
point sources [20,21,41]. Likely due to limited geographic 
distribution throughout the city, we did not find associa- 
tions with large point sources reported in the National 
Emissions Inventory [12] and Toxics Release Inventory 
[42] or petroleum storage facilities. We did however find 
associations with density of nearby facilities too small to 
require Title V permits, but permitted by the City to use 
solvents in industries known to produce BTEX com- 
pounds such as spray booths, graphics industries, and auto 
body and detailing shops. These facilities are distributed 
throughout many neighborhoods, although more concen- 
trated in industrial areas. An important limitation of our 
data is the lack of detailed information on solvent type 
and quantity at these smaller permitted facilities. Add- 
itional sampling near different types of facilities and 
improved emissions data or proxies could help elucidate 
these patterns in future work. 

Formaldehyde measurements showed less spatial vari- 
ability than benzene and total BTEX, compatible with 
findings from prior intra-urban analyses of data from 
national monitoring networks [43]. We found more tem- 
poral variability in formaldehyde with levels increasing 
with higher average temperatures. These findings are 
consistent with studies indicating higher temperature 
and longer daylight hours increase photochemical for- 
mation of secondary formaldehyde and levels peak dur- 
ing warm months and mid-day periods [43-45]. To our 
knowledge there have been no published LUR models 
for formaldehyde. The predictors of spatial variation 
found are consistent with known sources of local pri- 
mary ambient formaldehyde with higher levels found in 
areas of increased traffic emissions and interior built 
space indicating increased fuel combustion related to 
space and water heating. 

This study indicates that LUR modeling can be applied 
successfully to predicting benzene, BTEX, and formalde- 
hyde levels for use in exposure assessment and epi- 
demiological research in complex urban environments 
like New York City. Prior VOC and aldehyde exposure 
assessments have applied modeled data from EPA's 
National Air Toxics Assessment (NATA) [3,46-48], regu- 
latory monitoring data [49,50], and combinations 
of fixed site and personal monitoring [13,41]. While 
NATA modeling is useful in estimating relative concen- 
trations in regional scale assessments, in fine scale, 
urban analyses, estimates are subject to limited spatial 
resolution of area and mobile sources in the National 
Emissions Inventory [51]. Similarly, using few central-site 
regulatory monitors for exposure classification limits 
the ability to assess near-source concentration gradients, 
such as near roadways [15]. Prior air toxics assessments 
conducted in New York City using fixed site and per- 
sonal monitoring have provided important data on 
indoor, outdoor, and personal exposures among cohorts 
in specific neighborhoods [13,36] but have not offered 
comprehensive assessments across the City. 
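
As a rough illustration of the LUR approach discussed above, the following minimal sketch fits an ordinary least squares model to synthetic data and checks it against held-out sites. It is not the model fitted in this study: the predictor names, coefficients, and all data values are hypothetical and generated at random, and only the 59/11 split between model-building and validation sites follows the study design.

```python
import numpy as np

rng = np.random.default_rng(0)
n_sites = 70

# Hypothetical predictors at each site (synthetic values, not study data).
signal_density = rng.uniform(0, 50, n_sites)   # traffic signals in a buffer
road_length = rng.uniform(0, 2, n_sites)       # km of roadway in a buffer
temporal = rng.normal(0, 1, n_sites)           # temporal adjustment term

# Synthetic concentrations built from assumed coefficients plus noise.
conc = (0.5 + 0.01 * signal_density + 0.2 * road_length
        + 0.1 * temporal + rng.normal(0, 0.05, n_sites))

# Design matrix with an intercept column.
X = np.column_stack([np.ones(n_sites), signal_density, road_length, temporal])

# Random split: 59 sites for model building, 11 held out for validation.
idx = rng.permutation(n_sites)
train, test = idx[:59], idx[59:]

# Ordinary least squares fit on the model-building sites.
beta, *_ = np.linalg.lstsq(X[train], conc[train], rcond=None)


def r_squared(y, y_hat):
    """Fraction of variance in y explained by predictions y_hat."""
    ss_res = np.sum((y - y_hat) ** 2)
    ss_tot = np.sum((y - y.mean()) ** 2)
    return 1 - ss_res / ss_tot


r2_train = r_squared(conc[train], X[train] @ beta)
r2_test = r_squared(conc[test], X[test] @ beta)
print(f"model-building R2 = {r2_train:.2f}, validation R2 = {r2_test:.2f}")
```

Comparing the model-building and validation R2 values gives a simple check on whether the fitted relationships carry over to sites not used in model development.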

City-wide average temporally adjusted springtime mea- 
surements of benzene correspond to concentrations be- 
tween EPA's 1 in 10^ and 10^ lifetime cancer risk 
benchmarks [52]. Average formaldehyde levels in this 
study correspond to concentrations above the EPA 1 in 
10^ lifetime cancer risk benchmark [53]. While risk bench- 
marks are based on continuous exposures experienced 
over a lifetime, these springtime results suggest HAPs may 
contribute meaningfully to cancer and other health risks 
among large populations of New Yorkers who reside in 
close proximity to traffic and other local emission sources. 

An important limitation to these results is that data 
were collected during a single spring season. Pollutant 
concentrations observed may differ in other seasons, 
particularly for formaldehyde where differences in 
photochemical activity will affect secondary formation. 
However, spatial variation should be consistent through- 
out the year as patterns in source density overall remain 
relatively unchanged over short time periods. As with all 
LUR studies, limited data on specific emitters of VOCs 
add uncertainty to model estimates and 
likely attenuates associations between observed concen- 
trations and source indicators. 

These findings, and those from prior saturation sam- 
pling and land-use regression studies conducted in New 
York City (Clougherty et al. submitted 2012, [19,37]), indi- 
cate many of the neighborhoods impacted by high levels 
of PM2.5 and NO2 exposure may also experience high 
levels of benzene, BTEX and formaldehyde. High traffic 
density contributes to higher levels of both criteria and 
toxic pollutants evaluated here while areas of high build- 
ing density are associated with high PM2.5 and formalde- 
hyde levels. Because most studies of intra-urban spatial 
variation in air pollution exposures have focused on cri- 
teria pollutants, characterizing spatial patterns of exposure 
to common urban air toxics will be valuable in elucidating 
the health effects of individual pollutants in common pol- 
lutant mixtures [54] as well as development of emissions 
reduction strategies that maximize health benefits. 

Conclusions 

In this analysis we used high-density air quality monitoring 
and land-use regression methods to estimate variability in 
ambient exposures to benzene, BTEX compounds, and 
formaldehyde in New York City. We found significant 
intra-urban spatial variability in all compounds. Indica- 
tors of motor vehicle traffic, solvent usage, and stationary 
source combustion explained much of the variability in 
concentrations of these air toxics. Many of the same 
neighborhoods identified by prior studies as being 
impacted by high levels of criteria air pollutants are also 
found to have relatively higher levels of these common 
air toxics due to shared local sources. Characterization of 
these spatial patterns in air toxics will help improve 
understanding of the health effects of individual pollu- 
tants in complex urban air pollution mixtures and de- 
velop targeted air quality management strategies that 
reduce health disparities in pollutant-attributable 
adverse health outcomes. 